STEIN'S METHOD, HEAT KERNEL, AND LINEAR 
FUNCTIONS ON THE ORTHOGONAL GROUPS 

JASON FULMAN AND ADRIAN RÖLLIN

Abstract. Combining Stein's method with heat kernel techniques, we
study the function $\mathrm{Tr}(AO)$, where $A$ is a fixed $n \times n$ matrix over
$\mathbb{R}$ such that $\mathrm{Tr}(AA^t) = n$, and $O$ is from the Haar measure of the
orthogonal group $O(n,\mathbb{R})$. It is shown that the total variation distance
of the random variable $\mathrm{Tr}(AO)$ to a standard normal random variable is
bounded by $\frac{2\sqrt{2}}{n-1}$, slightly improving the constant in a bound of
Meckes, which was obtained by completely different methods.



1. Introduction 

Let $O(n,\mathbb{R})$ denote the group of $n \times n$ real orthogonal matrices, and
let $O$ be from the Haar measure on $O(n,\mathbb{R})$. Let $A$ be a fixed $n \times n$
real matrix satisfying the constraint $\mathrm{Tr}(AA^t) = n$, and let
$W = \mathrm{Tr}(AO)$. Letting
$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$ denote the cumulative
distribution function of a standard normal random variable, a result of
D'Aristotile, Diaconis and Newman [6] is that

$$\sup_{\substack{\mathrm{Tr}(AA^t) = n \\ -\infty < x < \infty}} \left| \mathbb{P}(W \le x) - \Phi(x) \right| \to 0$$

as $n \to \infty$. A recent paper of Meckes [23] shows that for all $n > 2$, the
total variation distance between the law of $W$ and a standard normal is at
most $\frac{2\sqrt{3}}{n-1}$. In this paper we use a very different construction
from that of Meckes and obtain a slightly better total variation bound of
$\frac{2\sqrt{2}}{n-1}$.

As Meckes observes, this problem has quite a bit of history. Borel in [2]
showed that if $X$ is a random vector on the $(n-1)$-dimensional unit sphere,
with first coordinate $X_1$, then $\mathbb{P}(\sqrt{n} X_1 \le t) \to \Phi(t)$ as
$n \to \infty$. Since the first column of a Haar distributed orthogonal matrix is
uniformly distributed on the sphere, Borel's theorem follows from the central
limit theorem for $W$, taking $A = \sqrt{n} \oplus 0$. Various generalizations of
Borel's theorem, focusing on blocks of entries of a Haar distributed
orthogonal matrix, can be found in the papers [8], [9], [16].

In the special case that $A = I$, $W$ becomes the trace of a Haar distributed
orthogonal matrix. Diaconis and Mallows (see [7]) proved that $\mathrm{Tr}(O)$ is
approximately normal. Stein [27] proved there exists a constant $C_r$ so that
the total variation distance between $\mathrm{Tr}(O)$ and a standard normal is at
most $C_r (n-1)^{-r}$, and Johansson [17] proved a central limit theorem in the
Kolmogorov metric with error term $O(e^{-cn})$ for some $c > 0$. Diaconis and
Shahshahani [10] proved multivariate central limit theorems (without error
terms) for the joint limiting distribution of
$\mathrm{Tr}(O), \mathrm{Tr}(O^2), \dots, \mathrm{Tr}(O^k)$, with $k$ fixed. Fulman [12] used heat
kernel methods to prove central limit theorems with error terms for
$\mathrm{Tr}(O^k)$ with $k$ growing; this was extended to the multivariate setting by
Döbler and Stolz [11].

Another reason for studying $\mathrm{Tr}(AO)$ is the similarity with Hoeffding's
combinatorial central limit theorem [15], which proves a central limit
theorem for $\mathrm{Tr}(AP)$ where $P$ is a random permutation matrix. Stein's method
has been used to give explicit bounds for Hoeffding's theorem; see [1] or [4].

It should be possible to extend the results in the current paper to a
multivariate setting. Indeed, Chatterjee and Meckes [3] prove bounds in the
Wasserstein metric between the distribution of
$\mathrm{Tr}(A_1 O), \dots, \mathrm{Tr}(A_k O)$ and a multivariate normal, where
$A_1, \dots, A_k$ are fixed and $O$ is from Haar measure of the orthogonal
group; see also Collins and Stolz [5] for an extension of [6] (without error
terms) to compact symmetric spaces.

Extensions of our results to the unitary and symplectic groups appear in
the companion paper [13], which slightly improves Meckes' constant in the
unitary case [23]. For the unitary groups there is also an interesting recent
paper [19] which uses characteristic functions and a heavy dose of analysis to
prove a central limit theorem for $\mathrm{Tr}(AU)$ with error term $O(n^{-2+b})$,
where $0 \le b < 1$ depends on the leading order asymptotics of the greatest
singular value of $A$.

Although our results are only a slight improvement of those of Meckes [23],
the heat kernel is a remarkable tool appearing in many parts of mathematics
(see [18] for a spirited defense of this statement with many references), and
we suspect that the blending of heat kernel techniques with Stein's method
will be useful for other problems.

The organization of this paper is as follows. Section 2 gives background on
symmetric functions and the orthogonal group. Section 3 gives background
on the heat kernel and Laplacian of the orthogonal group. Section 4 uses
tools from Section 2 and Section 3 to prove our main results.



2. Symmetric functions and the orthogonal group 

In this paper we use the zonal polynomial $Z_\lambda$ (with parameter 2) defined
in Section 7.2 of Macdonald [22]. To show the usefulness of symmetric
functions, we give a quick proof that $W = \mathrm{Tr}(AO)$ is asymptotically normal,
if $O$ is from Haar measure of the orthogonal group and $A$ satisfies
$\mathrm{Tr}(AA^t) = n$. As noted in Meckes [23], one can assume without loss of
generality that $A$ is diagonal: let $A = UDV$ be the singular value
decomposition of $A$. Then $W = \mathrm{Tr}(UDVO) = \mathrm{Tr}(DVOU)$, and the distribution
of $VOU$ is the same as the distribution of $O$ by the translation invariance
of Haar measure.

The key to proving a central limit theorem for $W$ is the following lemma
from page 423 of [22] (which should also prove useful for carrying the
results of [19] over to the orthogonal case). For its statement, recall that
the hooklength of a box $x$ is 1 + number of boxes in the same row as $x$ to
the right of $x$ + number of boxes in the same column as $x$ beneath $x$. In
the diagram below, representing the partition $(4,2,1)$ of 7, each box is
filled with its hook length:

    6 4 2 1
    3 1
    1
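The hooklength rule can be checked mechanically. The following is a minimal sketch, not taken from the paper; the function names `hook_lengths` and `hook_product` are illustrative choices. It computes the hook length of each box of a partition from the row lengths and the conjugate partition.

```python
from math import prod


def hook_lengths(partition):
    """Return the hook length of every box, row by row.

    hook = (boxes to the right) + (boxes beneath) + 1.
    """
    if not partition:
        return []
    # conj[j] = number of rows that have a box in column j (0-indexed).
    conj = [sum(1 for part in partition if part > j) for j in range(partition[0])]
    return [
        [(row_len - j - 1) + (conj[j] - i - 1) + 1 for j in range(row_len)]
        for i, row_len in enumerate(partition)
    ]


def hook_product(partition):
    """Product of all hook lengths, written h(lambda) in the text."""
    return prod(h for row in hook_lengths(partition) for h in row)


if __name__ == "__main__":
    for row in hook_lengths((4, 2, 1)):
        print(row)
```

For the partition $(4,2,1)$ this reproduces the three rows of the diagram, and `hook_product((2,))` gives $h(2\lambda)$ for $\lambda = (1)$.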

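Lemma 2.1. Let $Z_\lambda$ be the zonal polynomial (with parameter 2) and let
$h(2\lambda)$ be the product of the hooklengths of the partition $2\lambda$, whose rows
have length twice those of $\lambda$. Let $A$ have singular values $a_1, \dots, a_n$.
Then

$$\int_{O(n,\mathbb{R})} [\mathrm{Tr}(AO)]^r \, dO =
\begin{cases}
\displaystyle \sum_{|\lambda| = r/2,\ \ell(\lambda) \le n} \frac{r!}{h(2\lambda)} \, \frac{Z_\lambda(a_1^2, \dots, a_n^2)}{Z_\lambda(1, \dots, 1)} & \text{if } r \text{ is even,} \\
0 & \text{if } r \text{ is odd.}
\end{cases}$$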



Lemma 2.1 immediately implies that $\mathbb{E}(W) = 0$. Since by page 410 of [22]
one has that $Z_{(1)}(a_1^2, \dots, a_n^2) = a_1^2 + \cdots + a_n^2 = n$, it follows that
$\mathrm{Var}(W) = 1$. Lemma 2.1 also implies the following central limit theorem
for $W$.

Corollary 2.2. Let $A$ satisfy $\mathrm{Tr}(AA^t) = n$, and let $O$ be from the Haar
measure of the orthogonal group $O(n,\mathbb{R})$. Then as $n \to \infty$,
$W = \mathrm{Tr}(AO)$ tends to the standard normal distribution with mean 0 and
variance 1.

Proof. Lemma 2.1 implies that $\mathbb{E}(W^r) = 0$ for $r$ odd, so suppose that $r$
is even. From page 409 of [22],

$$Z_\lambda(1, \dots, 1) = \prod_{(i,j) \in \lambda} (n - i + 2j - 1).$$

With $\lambda$ fixed this is equal to $n^{|\lambda|} (1 + O(1/n))$.

Recall we can assume that $A$ is diagonal with entries $(a_1, \dots, a_n)$. Hence
by Lemma 2.1, $\mathbb{E}[\mathrm{Tr}(AO)^r]$, with $r$ fixed and even, is asymptotic to

$$\begin{aligned}
n^{-r/2} \sum_{|\lambda| = r/2} \frac{r!}{h(2\lambda)} Z_\lambda(a_1^2, \dots, a_n^2)
&= n^{-r/2} \frac{r!}{2^{r/2} (r/2)!} \sum_{|\lambda| = r/2} \frac{2^{r/2} (r/2)!}{h(2\lambda)} Z_\lambda(a_1^2, \dots, a_n^2) \\
&= n^{-r/2} \frac{r!}{2^{r/2} (r/2)!} (a_1^2 + \cdots + a_n^2)^{r/2} \\
&= \frac{r!}{2^{r/2} (r/2)!}.
\end{aligned}$$

The penultimate equality was from page 406 of [22].

It follows that for fixed $r$, $\mathbb{E}(W^r)$ is 0 for $r$ odd, and is asymptotic to
$(r-1) \cdots (3)(1)$ for $r$ even. The method of moments ([7]) implies that $W$
is asymptotically normal with mean 0 and variance 1. □
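The limiting even moments can be sanity-checked by exact integer arithmetic. The sketch below is illustrative and not part of the original argument; the helper names are hypothetical. It checks that $r!/(2^{r/2}(r/2)!)$ equals $(r-1)\cdots(3)(1)$, and, under the assumption that each zonal-polynomial ratio enters the moment with coefficient $r!/h(2\lambda)$ (as the hooklength product $h(2\lambda)$ in the proof suggests), that summing these coefficients over all partitions $\lambda$ of $r/2$ gives the same number.

```python
from math import factorial, prod


def partitions(m, largest=None):
    """Yield all partitions of m as non-increasing tuples."""
    largest = m if largest is None else largest
    if m == 0:
        yield ()
        return
    for first in range(min(m, largest), 0, -1):
        for rest in partitions(m - first, first):
            yield (first,) + rest


def hook_product(partition):
    """Product of hook lengths of a partition (1 for the empty partition)."""
    if not partition:
        return 1
    conj = [sum(1 for part in partition if part > j) for j in range(partition[0])]
    return prod(
        (row_len - j) + (conj[j] - i) - 1
        for i, row_len in enumerate(partition)
        for j in range(row_len)
    )


def hook_sum_moment(r):
    """Sum over |lambda| = r/2 of r!/h(2*lambda); zero when r is odd."""
    if r % 2:
        return 0
    total = 0
    for lam in partitions(r // 2):
        doubled = tuple(2 * part for part in lam)
        total += factorial(r) // hook_product(doubled)  # exact: counts standard tableaux
    return total


def closed_form_moment(r):
    """r!/(2^(r/2) (r/2)!) for even r."""
    return factorial(r) // (2 ** (r // 2) * factorial(r // 2))


def odd_double_factorial(r):
    """(r-1)(r-3)...(3)(1) for even r."""
    return prod(range(1, r, 2))


if __name__ == "__main__":
    for r in (2, 4, 6, 8, 10):
        print(r, hook_sum_moment(r), closed_form_moment(r), odd_double_factorial(r))
```

All three columns agree for each even $r$ tried, which matches the moments of a standard normal random variable.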

It will also be useful to work with Schur functions $s_\lambda(AO)$ evaluated on
the eigenvalues of $AO$. An in-depth treatment of Schur functions is in
Chapter 1 of [22]. From pages 421-422 of [22], one can express the integral of
a Schur function over the orthogonal group in terms of zonal polynomials as
follows:

Lemma 2.3. Let $s_\lambda$ be the Schur function and $Z_\mu$ be the zonal polynomial
with parameter 2, and let $A$ have singular values $a_1, \dots, a_n$. Then for
any partition $\lambda$ of length $< n$,

$$\int_{O(n,\mathbb{R})} s_\lambda(AO) \, dO =
\begin{cases}
\dfrac{Z_\mu(a_1^2, \dots, a_n^2)}{Z_\mu(1, \dots, 1)} & \text{if } \lambda = 2\mu, \\
0 & \text{otherwise.}
\end{cases}$$

The power sum symmetric functions $p_\lambda$ will also be useful. To define
these, given a matrix $M$, if $\lambda$ is an integer partition and $m_j$ denotes the
multiplicity of part $j$ in $\lambda$, we set $p_\lambda(M) = \prod_j \mathrm{Tr}(M^j)^{m_j}$. For
example, $p_{5,3,3}(M) = \mathrm{Tr}(M^5) \mathrm{Tr}(M^3)^2$. Sometimes we suppress the $M$
and use the notation $p_\lambda$. Lemma 2.4, from page 114 of [22], expresses power
sum symmetric functions in terms of Schur functions $s_\lambda$, and will be used
later in the paper.
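As a small illustration of the definition of $p_\lambda(M)$, here is a minimal sketch using plain Python lists as matrices. The helper names `mat_mul`, `trace_of_power` and `power_sum` are hypothetical and chosen only for this example.

```python
from math import prod


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace_of_power(m, k):
    """Tr(M^k) for a non-negative integer k."""
    n = len(m)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(k):
        power = mat_mul(power, m)
    return sum(power[i][i] for i in range(n))


def power_sum(partition, m):
    """p_lambda(M): the product of Tr(M^j) over the parts j of lambda, with multiplicity."""
    return prod(trace_of_power(m, part) for part in partition)


if __name__ == "__main__":
    M = [[1, 1], [0, 2]]  # upper triangular, so Tr(M^k) = 1 + 2^k
    print(power_sum((5, 3, 3), M))  # Tr(M^5) * Tr(M^3)^2
```

Listing the parts of $\lambda$ with repetition, as in `(5, 3, 3)`, accounts for the multiplicities $m_j$ automatically.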

Lemma 2.4. Let $\chi^\lambda_\rho$ denote the value of the irreducible character of the
symmetric group parameterized by $\lambda$ on the conjugacy class of elements of
type $\rho$. Then

$$p_\rho = \sum_\lambda \chi^\lambda_\rho s_\lambda.$$
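The expansion of power sums in Schur functions can be tested numerically for partitions of 4, which are the cases used later in the paper. The sketch below is illustrative only. It assumes the standard character table of $S_4$ restricted to the cycle types $(1,1,1,1)$, $(2,1,1)$ and $(2,2)$, and it evaluates Schur polynomials in four variables by the classical ratio-of-determinants formula; the sample point `XS` and all function names are arbitrary choices for this example.

```python
from math import prod

# Character values chi^lambda_rho of S_4 (standard table), for the three
# cycle types rho that appear in this paper.
S4_CHARACTERS = {
    (4,):         {(1, 1, 1, 1): 1, (2, 1, 1): 1,  (2, 2): 1},
    (3, 1):       {(1, 1, 1, 1): 3, (2, 1, 1): 1,  (2, 2): -1},
    (2, 2):       {(1, 1, 1, 1): 2, (2, 1, 1): 0,  (2, 2): 2},
    (2, 1, 1):    {(1, 1, 1, 1): 3, (2, 1, 1): -1, (2, 2): -1},
    (1, 1, 1, 1): {(1, 1, 1, 1): 1, (2, 1, 1): -1, (2, 2): 1},
}


def det(m):
    """Integer determinant by Laplace expansion along the first row."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def schur(lam, xs):
    """Schur polynomial s_lambda(xs) as det(x_i^(lam_j + n - j)) / det(x_i^(n - j))."""
    n = len(xs)
    lam = list(lam) + [0] * (n - len(lam))
    num = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    den = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return num // den  # exact division at distinct integer points


def power_sum(rho, xs):
    """p_rho(xs) = product over parts k of rho of (x_1^k + ... + x_n^k)."""
    return prod(sum(x ** k for x in xs) for k in rho)


def schur_expansion(rho, xs):
    """Right-hand side of the lemma: sum over lambda of chi^lambda_rho * s_lambda(xs)."""
    return sum(chars[rho] * schur(lam, xs) for lam, chars in S4_CHARACTERS.items())


XS = (1, 2, 3, 5)  # arbitrary distinct integers

if __name__ == "__main__":
    for rho in [(1, 1, 1, 1), (2, 1, 1), (2, 2)]:
        print(rho, power_sum(rho, XS), schur_expansion(rho, XS))
```

The two printed values agree for each $\rho$. In particular the table confirms the character values $\chi^{(4)}_\rho = 1$ for all three $\rho$, and $\chi^{(2,2)}_\rho = 2, 0, 2$ for $\rho = (1,1,1,1), (2,1,1), (2,2)$ respectively.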

3. Heat kernel and Laplacian of the orthogonal group 

Recall that a pair $(W, W')$ of random variables is called exchangeable if
$(W, W')$ has the same distribution as $(W', W)$. To construct an exchangeable
pair to be used in our applications, we use the heat kernel of a compact Lie
group $G$. See [14], [26] for a detailed discussion of heat kernels on compact
Lie groups. The papers [20], [21], [24] illustrate combinatorial uses of heat
kernels on compact Lie groups, and [21] also discusses the use of the heat
kernel for finite groups.
The heat kernel on $G$ is defined by setting, for $x, y \in G$ and $t > 0$,

$$K(t, x, y) = \sum_{n \ge 0} e^{-\lambda_n t} \phi_n(x) \phi_n(y), \tag{1}$$

where the $\lambda_n$ are the eigenvalues of the Laplacian repeated according to
multiplicity, and the $\phi_n$ are an orthonormal basis of eigenfunctions of
$L^2(G)$; these can be taken to be the irreducible characters of $G$.

We use the following properties of the heat kernel. Here $\Delta$ denotes the
Laplacian of $G$, and $e^{t\Delta}$ is defined as
$I + t\Delta + \frac{t^2 \Delta^2}{2} + \cdots$. Part 2 of Lemma 3.1 is immediate from
the expansion (1), and parts 1 and 3 of Lemma 3.1 are on page 198 of [14].



Lemma 3.1. Let $G$ be a compact Lie group, $x, y \in G$, and $t > 0$.

(1) $K(t, x, y)$ converges and is non-negative for all $x, y, t$.

(2) $\int_{y \in G} K(t, x, y) \, dy = 1$, where the integration is with respect to
Haar measure of $G$.

(3) $e^{t\Delta} \phi(x) = \int_{y \in G} K(t, x, y) \phi(y) \, dy$ for smooth $\phi$.
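A finite analogue may help to make the eigenfunction expansion and the first two properties concrete. The sketch below is an illustration only and is not the setting of the paper: it assumes the cyclic group $\mathbb{Z}_m$ with the graph Laplacian of the cycle, whose eigenvalues are $2 - 2\cos(2\pi k/m)$ and whose eigenfunctions are the characters of $\mathbb{Z}_m$, with counting measure in place of Haar measure. The function name `cycle_heat_kernel` is hypothetical.

```python
import math


def cycle_heat_kernel(m, t):
    """Heat kernel at time t on the cycle Z_m, built from its eigen-expansion.

    K(t, x, y) = (1/m) * sum_k exp(-lam_k t) * cos(2 pi k (x - y) / m),
    where lam_k = 2 - 2 cos(2 pi k / m). The cosine is the real part of the
    product of a character with its conjugate; imaginary parts cancel.
    """
    lam = [2.0 - 2.0 * math.cos(2.0 * math.pi * k / m) for k in range(m)]
    return [
        [
            sum(
                math.exp(-lam[k] * t) * math.cos(2.0 * math.pi * k * (x - y) / m)
                for k in range(m)
            ) / m
            for y in range(m)
        ]
        for x in range(m)
    ]


if __name__ == "__main__":
    kernel = cycle_heat_kernel(7, 0.3)
    print("row sums:", [round(sum(row), 12) for row in kernel])
    print("smallest entry:", min(min(row) for row in kernel))
```

In this toy case every row sums to 1, all entries are non-negative, and the kernel is symmetric in $x$ and $y$, mirroring parts 1 and 2 of the lemma and the symmetry used for reversibility.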

The symmetry in $x$ and $y$ of $K(t, x, y)$ shows that the heat kernel is a
reversible Markov process with respect to the Haar measure of $G$. It is a
standard fact [25], [28] that reversible Markov processes lead to exchangeable
pairs $(W, W')$. Namely, suppose one has a Markov chain with transition
probabilities $K(x, y)$ on a state space $X$, and that the Markov chain is
reversible with respect to a probability distribution $\pi$ on $X$. Then given
a function $f$ on $X$, if one lets $W = f(x)$ where $x$ is chosen from $\pi$ and
$W' = f(x')$ where $x'$ is obtained by moving from $x$ according to $K(x, y)$,
then $(W, W')$ is an exchangeable pair. In the special case of the heat kernel
on a compact Lie group $G$, given a function $f$ on $G$, one can construct an
exchangeable pair $(W, W')$ by letting $W = f(O)$ where $O$ is chosen from
Haar measure, and $W' = f(O')$, where $O'$ is obtained by moving time $t$ from
$O$ via the heat kernel. To define the exchangeable pair $(W, W')$ used in this
paper, we further specialize by setting $f(O) = \mathrm{Tr}(AO)$.
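The passage from a reversible chain to an exchangeable pair can be seen on a toy example. The sketch below is hypothetical and unrelated to the orthogonal group: it assumes a three-state lazy random walk on a path, reversible with respect to the stated $\pi$, together with an arbitrary function $f$, and computes the joint law of $(W, W')$ exactly. A non-reversible rotation chain is included for contrast.

```python
from fractions import Fraction
from itertools import product

# Lazy random walk on the path 0 - 1 - 2 (illustrative choice).
K_REVERSIBLE = [
    [Fraction(1, 2), Fraction(1, 2), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(0), Fraction(1, 2), Fraction(1, 2)],
]
PI_REVERSIBLE = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

# Deterministic rotation 0 -> 1 -> 2 -> 0: stationary for the uniform
# distribution but not reversible.
K_ROTATION = [
    [Fraction(0), Fraction(1), Fraction(0)],
    [Fraction(0), Fraction(0), Fraction(1)],
    [Fraction(1), Fraction(0), Fraction(0)],
]
PI_UNIFORM = [Fraction(1, 3)] * 3


def is_reversible(pi, K):
    """Detailed balance: pi(x) K(x, y) = pi(y) K(y, x) for all x, y."""
    n = len(pi)
    return all(pi[x] * K[x][y] == pi[y] * K[y][x] for x in range(n) for y in range(n))


def joint_law(pi, K, f):
    """Law of (W, W') = (f(x), f(x')) with x ~ pi and x' one step of K from x."""
    law = {}
    for x, y in product(range(len(pi)), repeat=2):
        key = (f[x], f[y])
        law[key] = law.get(key, Fraction(0)) + pi[x] * K[x][y]
    return law


def is_exchangeable(law):
    """Check that (W, W') and (W', W) have the same law."""
    return all(law.get((b, a), Fraction(0)) == p for (a, b), p in law.items())


if __name__ == "__main__":
    f = [-1, 0, 2]  # arbitrary function on the state space
    print(is_reversible(PI_REVERSIBLE, K_REVERSIBLE),
          is_exchangeable(joint_law(PI_REVERSIBLE, K_REVERSIBLE, f)))
    print(is_reversible(PI_UNIFORM, K_ROTATION),
          is_exchangeable(joint_law(PI_UNIFORM, K_ROTATION, f)))
```

Detailed balance makes the joint law symmetric, which is exactly exchangeability; the rotation chain fails both checks.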

To analyze the heat kernel on the orthogonal groups, we need to understand
the corresponding Laplacian. Proposition 2.7 of the paper [20] gave an
explicit description of the Laplacian for $SO(n,\mathbb{R})$. The same calculations
work for $O(n,\mathbb{R})$ (which shares the same Lie algebra with $SO(n,\mathbb{R})$), and
yield the following result.

Lemma 3.2. Let $A$ satisfy $\mathrm{Tr}(AA^t) = n$. Then

(1) $$\Delta_{O(n)} p_1(AO) = -\frac{(n-1)}{2} p_1(AO).$$

(2) $$\Delta_{O(n)} p_{1,1}(AO) = -(n-1) p_{1,1}(AO) - p_2(AO) + n.$$



4. Main results

To begin we describe the exchangeable pair $(W, W')$. Namely
$W = \mathrm{Tr}(AO) = p_1(AO)$, where as explained earlier one can assume $A$ is
diagonal. We fix $t > 0$, and motivated by Section 3 define

$$W' = e^{t\Delta}(W) = W + \sum_{k \ge 1} \frac{t^k}{k!} \Delta^k(W).$$

Lemma 4.1 computes the conditional expectation $\mathbb{E}[W' | O]$.

Lemma 4.1.

$$\mathbb{E}[W' | O] = \left( 1 - \frac{t(n-1)}{2} \right) W + O(t^2).$$

Proof. Applying part 3 of Lemma 3.1 and part 1 of Lemma 3.2,

$$\begin{aligned}
\mathbb{E}[W' | O] &= e^{t\Delta}(W) \\
&= W + t \Delta W + O(t^2) \\
&= W - \frac{t(n-1)}{2} W + O(t^2),
\end{aligned}$$

as desired. □

Lemma 4.2 computes $\mathbb{E}[(W' - W)^2 | O]$.

Lemma 4.2.

$$\mathbb{E}[(W' - W)^2 | O] = t [n - p_2(AO)] + O(t^2).$$

Proof. Clearly

$$\mathbb{E}[(W' - W)^2 | O] = \mathbb{E}[(W')^2 | O] - 2W \, \mathbb{E}[W' | O] + W^2.$$

By part 3 of Lemma 3.1 and part 2 of Lemma 3.2,

$$\begin{aligned}
\mathbb{E}[(W')^2 | O] &= W^2 + t \Delta p_{1,1}(AO) + O(t^2) \\
&= W^2 + t \left[ -(n-1) p_{1,1}(AO) - p_2(AO) + n \right] + O(t^2).
\end{aligned}$$

By Lemma 4.1, $-2W \, \mathbb{E}[W' | O]$ is equal to

$$-2W^2 + t(n-1) p_{1,1}(AO) + O(t^2).$$

Thus

$$\mathbb{E}[(W')^2 | O] - 2W \, \mathbb{E}[W' | O] + W^2 = t [n - p_2(AO)] + O(t^2).$$

□

Lemma 4.3 bounds the variance of $p_2(AO)$.

Lemma 4.3. Suppose that $n > 4$. Then $\mathrm{Var}[p_2(AO)] \le 2$.

Proof. By definition, $\mathrm{Var}[p_2(AO)] = \mathbb{E}[p_{2,2}(AO)] - (\mathbb{E}[p_2(AO)])^2$.
From Lemma 2.4,

$$p_2(AO) = -s_{(1,1)}(AO) + s_2(AO).$$

Then Lemma 2.3 gives that $\int_{O(n,\mathbb{R})} -s_{(1,1)}(AO) \, dO = 0$ and (combined
with page 410 of [22]) that

$$\int_{O(n,\mathbb{R})} s_2(AO) \, dO = \frac{Z_1(a_1^2, \dots, a_n^2)}{Z_1(1, \dots, 1)} = \frac{a_1^2 + \cdots + a_n^2}{n} = 1.$$

Thus $\mathbb{E}[p_2(AO)] = 1$.

To compute $\mathbb{E}[p_{2,2}(AO)]$, Lemma 2.4 gives that

$$p_{2,2}(AO) = \chi^{(4)}_{(2,2)} s_4 + \chi^{(3,1)}_{(2,2)} s_{(3,1)} + \chi^{(2,2)}_{(2,2)} s_{(2,2)} + \chi^{(2,1,1)}_{(2,2)} s_{(2,1,1)} + \chi^{(1,1,1,1)}_{(2,2)} s_{(1,1,1,1)}.$$

Lemma 2.3 and the character table of $S_4$ then give that

$$\begin{aligned}
\mathbb{E}[p_{(2,2)}(AO)] &= \mathbb{E}\left[ \chi^{(4)}_{(2,2)} s_4(AO) + \chi^{(2,2)}_{(2,2)} s_{(2,2)}(AO) \right] \\
&= \mathbb{E}\left[ s_4(AO) + 2 s_{(2,2)}(AO) \right] \\
&= \frac{Z_2(a_1^2, \dots, a_n^2)}{Z_2(1, \dots, 1)} + 2 \, \frac{Z_{(1,1)}(a_1^2, \dots, a_n^2)}{Z_{(1,1)}(1, \dots, 1)}.
\end{aligned}$$

From pages 382 and 383 of [22], it follows that

$$\frac{Z_2(a_1^2, \dots, a_n^2)}{Z_2(1, \dots, 1)} = \frac{n^2 + 2(a_1^4 + \cdots + a_n^4)}{n^2 + 2n}$$

and that

$$\frac{Z_{(1,1)}(a_1^2, \dots, a_n^2)}{Z_{(1,1)}(1, \dots, 1)} = \frac{n^2 - (a_1^4 + \cdots + a_n^4)}{n^2 - n}.$$

Thus

$$\mathrm{Var}[p_2(AO)] = \frac{n^2}{n^2 + 2n} + \frac{2n^2}{n^2 - n} - 1 - (a_1^4 + \cdots + a_n^4) \left( \frac{2}{n^2 - n} - \frac{2}{n^2 + 2n} \right).$$

By the method of Lagrange multipliers, $a_1^4 + \cdots + a_n^4$ is minimized subject
to the constraint $a_1^2 + \cdots + a_n^2 = n$ when $a_1 = \cdots = a_n = 1$. But

$$\frac{n^2}{n^2 + 2n} + \frac{2n^2}{n^2 - n} - 1 - n \left( \frac{2}{n^2 - n} - \frac{2}{n^2 + 2n} \right) = 2,$$

implying that $\mathrm{Var}[p_2(AO)] \le 2$, as claimed. □
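The variance bound can be checked with exact rational arithmetic. The following sketch is illustrative and uses hypothetical function names. It assumes the expression for the variance in terms of the two zonal-polynomial ratios, $\mathrm{Var}[p_2(AO)] = \frac{n^2 + 2S}{n^2 + 2n} + 2\,\frac{n^2 - S}{n^2 - n} - 1$ with $S = a_1^4 + \cdots + a_n^4$, and computes the denominators from the product formula $Z_\lambda(1, \dots, 1) = \prod_{(i,j) \in \lambda}(n - i + 2j - 1)$. The inputs are the squared singular values $b_i = a_i^2$, which must sum to $n$.

```python
from fractions import Fraction
from math import prod


def zonal_at_ones(lam, n):
    """Z_lambda(1, ..., 1) with n ones: product of (n - i + 2j - 1) over boxes (i, j)."""
    return prod(
        n - i + 2 * j - 1
        for i, row_len in enumerate(lam, start=1)
        for j in range(1, row_len + 1)
    )


def var_p2(squared_singular_values):
    """Var[p_2(AO)] as a function of b_i = a_i^2, assuming sum(b_i) = n."""
    b = list(squared_singular_values)
    n = len(b)
    if sum(b) != n:
        raise ValueError("the squared singular values must sum to n")
    s4 = sum(x * x for x in b)  # a_1^4 + ... + a_n^4
    ratio_2 = Fraction(n * n + 2 * s4, zonal_at_ones((2,), n))
    ratio_11 = Fraction(n * n - s4, zonal_at_ones((1, 1), n))
    return ratio_2 + 2 * ratio_11 - 1


if __name__ == "__main__":
    print(var_p2([1, 1, 1, 1, 1]))  # extremal case a_1 = ... = a_n = 1
    print(var_p2([2, 2, 1, 0, 0]))
    print(var_p2([5, 0, 0, 0, 0]))  # all mass on one singular value
```

The extremal case returns exactly 2, and the other choices give strictly smaller values, in line with the Lagrange multiplier argument.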
Next we compute expected values of low order moments of $W' - W$.

Lemma 4.4. Suppose that $n > 4$. Then

(1) $\mathbb{E}(W' - W)^2 = t(n-1) + O(t^2)$.

(2) $\mathbb{E}(W' - W)^4 = O(t^2)$.

(3) $\mathbb{E}|W' - W|^3 = O(t^{3/2})$.

Proof. Lemma 4.2 implies that $\mathbb{E}(W' - W)^2 = t \, \mathbb{E}[n - p_2(AO)] + O(t^2)$. By
the proof of Lemma 4.3, $\mathbb{E}[p_2(AO)] = 1$. Thus

$$\mathbb{E}(W' - W)^2 = t(n-1) + O(t^2),$$

as claimed.

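For part 2, first note that since

$$\mathbb{E}[(W' - W)^4] = \mathbb{E}(W^4) - 4\mathbb{E}(W^3 W') + 6\mathbb{E}[W^2 (W')^2] - 4\mathbb{E}[W (W')^3] + \mathbb{E}[(W')^4],$$

exchangeability of $(W, W')$ gives that

$$\begin{aligned}
\mathbb{E}(W' - W)^4 &= 2\mathbb{E}(W^4) - 8\mathbb{E}(W^3 W') + 6\mathbb{E}[W^2 (W')^2] \\
&= 2\mathbb{E}(W^4) - 8\mathbb{E}\left[ W^3 \, \mathbb{E}[W' | O] \right] + 6\mathbb{E}\left[ W^2 \, \mathbb{E}[(W')^2 | O] \right].
\end{aligned}$$

By Lemma 4.1,

$$\mathbb{E}[W' | O] = \left( 1 - \frac{t(n-1)}{2} \right) W + O(t^2),$$

and by the proof of Lemma 4.2,

$$\mathbb{E}[(W')^2 | O] = W^2 + t \left[ -(n-1) p_{1,1}(AO) - p_2(AO) + n \right] + O(t^2).$$

Thus

$$\begin{aligned}
\mathbb{E}(W' - W)^4 &= 2\mathbb{E}(W^4) - 8\mathbb{E}(W^4) + 6\mathbb{E}(W^4) \\
&\quad + t \left[ 4(n-1) \mathbb{E}(W^4) + 6\mathbb{E}\left[ W^2 \left[ -(n-1) p_{1,1}(AO) - p_2(AO) + n \right] \right] \right] \\
&\quad + O(t^2) \\
&= t \left[ -2(n-1) \mathbb{E}[p_{1,1,1,1}(AO)] - 6\mathbb{E}[p_{2,1,1}(AO)] + 6n \right] + O(t^2),
\end{aligned}$$

where we used that $\mathbb{E}(W^2) = 1$.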

From Lemma 2.4, $p_{1,1,1,1}(AO)$ is equal to

$$\chi^{(4)}_{(1,1,1,1)} s_4 + \chi^{(3,1)}_{(1,1,1,1)} s_{(3,1)} + \chi^{(2,2)}_{(1,1,1,1)} s_{(2,2)} + \chi^{(2,1,1)}_{(1,1,1,1)} s_{(2,1,1)} + \chi^{(1,1,1,1)}_{(1,1,1,1)} s_{(1,1,1,1)}.$$

From Lemma 2.3 and the fact that $\chi^{(4)}_{(1,1,1,1)} = 1$ and
$\chi^{(2,2)}_{(1,1,1,1)} = 2$, it follows that

$$\begin{aligned}
\mathbb{E}[p_{1,1,1,1}(AO)] &= \mathbb{E}[s_4(AO) + 2 s_{2,2}(AO)] \\
&= \frac{Z_2(a_1^2, \dots, a_n^2)}{Z_2(1, \dots, 1)} + 2 \, \frac{Z_{1,1}(a_1^2, \dots, a_n^2)}{Z_{1,1}(1, \dots, 1)},
\end{aligned}$$

where $Z_\lambda$ denotes the zonal polynomial with parameter two.
Similarly, $p_{2,1,1}(AO)$ is equal to

$$\chi^{(4)}_{(2,1,1)} s_4 + \chi^{(3,1)}_{(2,1,1)} s_{(3,1)} + \chi^{(2,2)}_{(2,1,1)} s_{(2,2)} + \chi^{(2,1,1)}_{(2,1,1)} s_{(2,1,1)} + \chi^{(1,1,1,1)}_{(2,1,1)} s_{(1,1,1,1)}.$$

From Lemma 2.3 and the fact that $\chi^{(4)}_{(2,1,1)} = 1$ and
$\chi^{(2,2)}_{(2,1,1)} = 0$, it follows that

$$\mathbb{E}[p_{2,1,1}(AO)] = \mathbb{E}[s_4(AO)] = \frac{Z_2(a_1^2, \dots, a_n^2)}{Z_2(1, \dots, 1)},$$

with $Z_2$ the zonal polynomial with parameter 2.
As in the proof of Lemma 4.3, one has that

$$\frac{Z_2(a_1^2, \dots, a_n^2)}{Z_2(1, \dots, 1)} = \frac{n^2 + 2(a_1^4 + \cdots + a_n^4)}{n^2 + 2n}$$

and that

$$\frac{Z_{(1,1)}(a_1^2, \dots, a_n^2)}{Z_{(1,1)}(1, \dots, 1)} = \frac{n^2 - (a_1^4 + \cdots + a_n^4)}{n^2 - n}.$$

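Plugging in these values, one obtains that

$$\begin{aligned}
\mathbb{E}(W' - W)^4 &= -2(n-1) t \left[ \frac{n^2 + 2(a_1^4 + \cdots + a_n^4)}{n^2 + 2n} + \frac{2(n^2 - (a_1^4 + \cdots + a_n^4))}{n^2 - n} \right] \\
&\quad - 6t \, \frac{n^2 + 2(a_1^4 + \cdots + a_n^4)}{n^2 + 2n} + 6nt + O(t^2) \\
&= O(t^2),
\end{aligned}$$

proving part 2 of the lemma.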

For part 3 of the lemma, one uses the Cauchy-Schwarz inequality to
obtain that

$$\mathbb{E}|W' - W|^3 \le \sqrt{\mathbb{E}(W' - W)^2 \, \mathbb{E}(W' - W)^4}.$$

Part 3 then follows from parts 1 and 2 of the lemma. □
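The cancellation of the order-$t$ term in part 2 is a short algebraic identity, and it can be confirmed with exact fractions. The sketch below is an illustration with a hypothetical function name. It takes $n$ and $S = a_1^4 + \cdots + a_n^4$ as inputs and evaluates the coefficient of $t$, namely $-2(n-1)\mathbb{E}[p_{1,1,1,1}(AO)] - 6\mathbb{E}[p_{2,1,1}(AO)] + 6n$, using the two zonal-polynomial ratios from the proof.

```python
from fractions import Fraction


def fourth_moment_t_coefficient(n, s4):
    """Coefficient of t in E(W' - W)^4, with s4 = a_1^4 + ... + a_n^4."""
    ratio_2 = Fraction(n * n + 2 * s4, n * n + 2 * n)   # Z_2 ratio
    ratio_11 = Fraction(n * n - s4, n * n - n)           # Z_(1,1) ratio
    e_p1111 = ratio_2 + 2 * ratio_11                     # E[p_{1,1,1,1}(AO)]
    e_p211 = ratio_2                                     # E[p_{2,1,1}(AO)]
    return -2 * (n - 1) * e_p1111 - 6 * e_p211 + 6 * n


if __name__ == "__main__":
    for n in (4, 5, 10):
        for s4 in (n, 2 * n, n * n):
            print(n, s4, fourth_moment_t_coefficient(n, s4))
```

The coefficient is zero for every $n$ and $S$ tried, so only the $O(t^2)$ term survives. Note that the value of $S$ plays no role in the cancellation.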

Lemma 4.5. Let $f$ be a twice differentiable function with bounded second
derivative. Then

$$\mathbb{E}[f'(W) - W f(W)] = \mathbb{E}\left[ f'(W) \, \frac{p_2(AO) - 1}{n - 1} \right].$$

Proof. Since $(W, W')$ is an exchangeable pair,

$$\begin{aligned}
0 &= \mathbb{E}[(W' - W)[f(W) + f(W')]] \\
&= \mathbb{E}[(W' - W)[2f(W) + [f(W') - f(W)]]] \\
&= 2\mathbb{E}[(W' - W) f(W)] + \mathbb{E}[(W' - W)(f(W') - f(W))].
\end{aligned}$$

Now by Lemma 4.1,

$$\begin{aligned}
2\mathbb{E}[(W' - W) f(W)] &= 2\mathbb{E}\left[ f(W) \, \mathbb{E}[(W' - W) | O] \right] \\
&= -t(n-1) \mathbb{E}[f(W) W] + O(t^2).
\end{aligned}$$

By Taylor's theorem and Lemma 4.2 it follows that

$$\begin{aligned}
\mathbb{E}[(W' - W)(f(W') - f(W))] &= \mathbb{E}\left[ (W' - W)[f'(W)(W' - W)] + R \right] \\
&= \mathbb{E}\left[ f'(W) \, \mathbb{E}[(W' - W)^2 | O] + R \right] \\
&= \mathbb{E}\left[ t f'(W) [n - p_2(AO)] + R \right] + O(t^2),
\end{aligned}$$

with $|R| \le \frac{\|f''\|}{2} |W' - W|^3$, where $\| \cdot \|$ denotes the supremum norm.
From part 3 of Lemma 4.4 it follows that $\mathbb{E}|R| = O(t^{3/2})$.

Summarizing, we have that

$$\begin{aligned}
0 &= 2\mathbb{E}[(W' - W) f(W)] + \mathbb{E}[(W' - W)(f(W') - f(W))] \\
&= -t(n-1) \mathbb{E}[f(W) W] + t \, \mathbb{E}[f'(W)[n - p_2(AO)]] + O(t^{3/2}).
\end{aligned}$$

Dividing both sides of this equation by $t$ and then letting $t \to 0$ completes
the proof. □

Now we prove our main result. 

Theorem 4.6. The total variation distance between $W$ and a standard
normal is at most $\frac{2\sqrt{2}}{n-1}$.

Proof. Note that the total variation distance between two random variables
$W$ and $Z$ can be described as

$$d_{TV}(W, Z) = \sup_h |\mathbb{E}h(W) - \mathbb{E}h(Z)|,$$

where the supremum ranges over all $h$ that are continuous and bounded from
below by 0 and from above by 1 and have compact support.

Any such function $h$ can be approximated by $C^\infty$ functions in the
supremum norm. For fixed $h$ let $h_m$ be a sequence of $C^\infty$ functions such
that $\|h - h_m\| \to 0$ as $m \to \infty$, where $\| \cdot \|$ denotes the supremum norm.

Let $f_m$ be the solution to the Stein equation

$$f_m'(x) - x f_m(x) = h_m(x) - \mathbb{E}h_m(Z),$$

where $Z$ has the standard normal distribution. Since we approximate $h$
by a compactly supported $C^\infty$ function, Formula (47) on page 25 of [28]
allows us to assume that $\|f_m''\|$ is bounded. Applying Lemma 4.5 (so taking
$T(AO) = \frac{p_2(AO) - 1}{n - 1}$ in the formulas below), we have

$$\begin{aligned}
|\mathbb{E}h(W) - \mathbb{E}h(Z)|
&\le 2\|h - h_m\| + |\mathbb{E}h_m(W) - \mathbb{E}h_m(Z)| \\
&\le 2\|h - h_m\| + |\mathbb{E}[f_m'(W) - W f_m(W)]| \\
&\le 2\|h - h_m\| + \sqrt{\mathrm{Var}(T)} \, \|f_m'\| \\
&\le 2\|h - h_m\| + 2\sqrt{\mathrm{Var}(T)} \, \|h_m - \mathbb{E}h_m(Z)\| \\
&\le (2 + 4\sqrt{\mathrm{Var}(T)}) \|h - h_m\| + 2\sqrt{\mathrm{Var}(T)} \, \|h - \mathbb{E}h(Z)\|.
\end{aligned}$$

(The inequality $\|f_m'\| \le 2\|h_m - \mathbb{E}h_m(Z)\|$ is Formula (46) on page 25 of
Stein [28].)

Letting $m \to \infty$ we therefore have

$$|\mathbb{E}h(W) - \mathbb{E}h(Z)| \le 2\sqrt{\mathrm{Var}(T)} \, \|h - \mathbb{E}h(Z)\|$$

for any continuous function $h$. As $\|h - \mathbb{E}h(Z)\| \le 1$ and
$\mathrm{Var}(T) \le \frac{2}{(n-1)^2}$ by Lemma 4.3, the total variation bound now
follows. □
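As a numerical illustration of the theorem, one can simulate $W = \mathrm{Tr}(AO)$ and compare it with a standard normal. The sketch below is not part of the proof and rests on stated assumptions: Haar distributed orthogonal matrices are sampled by Gram-Schmidt orthonormalization of independent standard Gaussian columns (a standard construction), $A$ is taken to be the identity with $n = 6$, and the empirical Kolmogorov distance is used as a computable quantity that is at most the total variation distance, up to sampling noise. All names are illustrative.

```python
import math
import random


def haar_orthogonal_columns(n, rng):
    """Sample the columns of a Haar orthogonal matrix via Gram-Schmidt on Gaussian vectors."""
    cols = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in cols:  # remove the components along earlier columns
            dot = sum(a * b for a, b in zip(u, v))
            v = [b - dot * a for a, b in zip(u, v)]
        norm = math.sqrt(sum(b * b for b in v))
        cols.append([b / norm for b in v])
    return cols  # cols[j][i] is the (i, j) entry


def sample_w(diag, rng):
    """W = Tr(AO) for diagonal A = diag(a_1, ..., a_n), i.e. sum_i a_i O_ii."""
    cols = haar_orthogonal_columns(len(diag), rng)
    return sum(a * cols[i][i] for i, a in enumerate(diag))


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance(samples):
    """Sup distance between the empirical CDF of the samples and the normal CDF."""
    xs = sorted(samples)
    m = len(xs)
    return max(
        max(abs((k + 1) / m - normal_cdf(x)), abs(k / m - normal_cdf(x)))
        for k, x in enumerate(xs)
    )


rng = random.Random(0)          # fixed seed so the run is reproducible
n = 6
diag = [1.0] * n                # A = I, so Tr(A A^t) = n
ws = [sample_w(diag, rng) for _ in range(3000)]
mean = sum(ws) / len(ws)
var = sum((w - mean) ** 2 for w in ws) / len(ws)
ks = kolmogorov_distance(ws)
bound = 2.0 * math.sqrt(2.0) / (n - 1)

if __name__ == "__main__":
    print(f"mean={mean:.3f} var={var:.3f} kolmogorov={ks:.3f} bound={bound:.3f}")
```

The sample mean and variance should be close to 0 and 1, and the empirical Kolmogorov distance should sit well below $\frac{2\sqrt{2}}{n-1}$. This is only a consistency check, since the Kolmogorov distance is a lower bound for the total variation distance and says nothing about sharpness.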

Acknowledgements 

Fulman was partially supported by NSF grant DMS 0802082. We thank
Thierry Lévy for help computing with the orthogonal group Laplacian.

References 

[1] Bolthausen, E., An estimate of the remainder in a combinatorial central
limit theorem, Z. Wahrsch. Verw. Gebiete 66 (1984), 379-386.

[2] Borel, E., Sur les principes de la théorie cinétique des gaz, Annales de
l'École Normale Sup. 23 (1906), 9-32.

[3] Chatterjee, S. and Meckes, E., Multivariate normal approximation using
exchangeable pairs, ALEA Lat. Am. J. Probab. Math. Stat. 4 (2008),
257-283.

[4] Chen, L. H. Y., Goldstein, L., and Shao, Q., Normal approximation by 
Stein's method. Probability and its Applications (New York). Springer, 
Heidelberg, 2011. 

[5] Collins, B. and Stolz, M., Borel theorems for random matrices from the 
classical compact symmetric spaces, Ann. Probab. 36 (2008), 876-895. 

[6] D'Aristotile, A., Diaconis, P., and Newman, C., Brownian motion and
the classical groups, in Probability, statistics and their applications:
papers in honor of Rabi Bhattacharya, 97-116, IMS Lecture Notes Monogr.
Ser., 41, Inst. Math. Statist., Beachwood, OH, 2003.

[7] Diaconis, P., Application of the method of moments in probability and
statistics. Moments in mathematics (San Antonio, Tex., 1987), 125-142,
Proc. Sympos. Appl. Math., 37, Amer. Math. Soc., Providence, RI, 1987.

[8] Diaconis, P., Eaton, M., and Lauritzen, S., Finite de Finetti theorems 
in linear models and multivariate analysis, Scand. J. Statist. 19 (1992), 
289-315. 






[9] Diaconis, P. and Freedman, D., A dozen de Finetti style results in search
of a theory, Ann. Inst. H. Poincaré Probab. Statist. 23 (1987), 397-423.

[10] Diaconis, P. and Shahshahani, M., On the eigenvalues of random matri- 
ces. Studies in applied probability. J. Appl. Probab. 31 A (1994), 49-62. 

[11] Döbler, C. and Stolz, M., Stein's method and the multivariate CLT
for traces of powers on the classical compact groups, arXiv:1012.3730
(2010).

[12] Fulman, J., Stein's method, heat kernel, and traces of powers of elements
of compact Lie groups, arXiv:1005.1306 (2010).

[13] Fulman, J., Stein's method, heat kernel, and linear functions on the 
unitary and symplectic groups, preprint. 

[14] Grigor'yan, A., Heat kernel and analysis on manifolds, AMS/IP Studies 
in Advanced Mathematics, 47. American Mathematical Society, Provi- 
dence, RI; International Press, Boston, MA, 2009. 

[15] Hoeffding, W., A combinatorial central limit theorem, Ann. Math. Sta- 
tistics 22 (1951), 558-566. 

[16] Jiang, T., Maxima of entries of Haar distributed matrices, Probab. The- 
ory Related Fields 131 (2005), 121-144. 

[17] Johansson, K., On random matrices from the compact classical groups, 
Ann. of Math. 145 (1997), 519-545. 

[18] Jorgenson, J. and Lang, S., The ubiquitous heat kernel. Mathematics 
unlimited — 2001 and beyond, 655-683, Springer, Berlin, 2001. 

[19] Keating, J. P., Mezzadri, F. and Singphu, B., Rate of convergence of 
linear functions on the unitary group, J. Phys. A 44 (2011), no. 3, 
035204, 27 pp. 

[20] Lévy, T., Schur-Weyl duality and the heat kernel measure on the unitary
group, Adv. Math. 218 (2008), 537-575.

[21] Liu, K., Heat kernels, symplectic geometry, moduli spaces and finite 
groups, in Surveys in differential geometry: differential geometry in- 
spired by string theory, 527-542, Surv. Differ. Geom., 5, Int. Press, 
Boston, MA, 1999. 

[22] Macdonald, I. G., Symmetric functions and Hall polynomials. Second 
edition. The Clarendon Press, Oxford University Press, New York, 1995. 

[23] Meckes, E., Linear functions on the classical matrix groups, Trans. 
Amer. Math. Soc. 360 (2008), 5355-5366. 

[24] Rains, E. M., Combinatorial properties of Brownian motion on the com- 
pact classical groups, J. Theoret. Probab. 10 (1997), 659-679. 

[25] Rinott, Y. and Rotar, V., On coupling constructions and rates in the 
CLT for dependent summands with applications to the antivoter model 
and weighted U-statistics, Ann. Appl. Probab. 7 (1997), 1080-1105. 

[26] Rosenberg, S., The Laplacian on a Riemannian manifold. An introduc- 
tion to analysis on manifolds. London Mathematical Society Student 
Texts, 31. Cambridge University Press, Cambridge, 1997. 

[27] Stein, C., The accuracy of the normal approximation to the distribution
of the traces of powers of random orthogonal matrices. Stanford
University Statistics Department technical report no. 470, (1995).






[28] Stein, C., Approximate computation of expectations. Institute of
Mathematical Statistics Lecture Notes-Monograph Series, 7. Institute of
Mathematical Statistics, Hayward, CA, 1986.

Department of Mathematics, University of Southern California, Los Ange- 
les, CA, 90089, USA 

E-mail address: fulman@usc.edu 

Department of Statistics and Applied Probability, National University of 
Singapore, Singapore 117546 

E-mail address: adrian.roellin@nus.edu.sg 



